Power and Predictive Accuracy of Polygenic Risk Scores
Abstract
Polygenic scores have recently been used to summarise genetic effects among an ensemble of markers that do not individually achieve significance in a large-scale association study. Markers are selected using an initial training sample and used to construct a score in an independent replication sample by forming the weighted sum of associated alleles within each subject. Association between a trait and this composite score implies that a genetic signal is present among the selected markers, and the score can then be used for prediction of individual trait values. This approach has been used to obtain evidence of a genetic effect when no single markers are significant, to establish a common genetic basis for related disorders, and to construct risk prediction models. In some cases, however, the desired association or prediction has not been achieved. Here, the power and predictive accuracy of a polygenic score are derived from a quantitative genetics model as a function of the sizes of the two samples, explained genetic variance, selection thresholds for including a marker in the score, and methods for weighting effect sizes in the score. Expressions are derived for quantitative and discrete traits, the latter allowing for case/control sampling. A novel approach to estimating the variance explained by a marker panel is also proposed. It is shown that published studies with significant association of polygenic scores have been well powered, whereas those with negative results can be explained by low sample size. It is also shown that useful levels of prediction may only be approached when predictors are estimated from very large samples, up to an order of magnitude greater than currently available. Therefore, polygenic scores currently have more utility for association testing than predicting complex traits, but prediction will become more feasible as sample sizes continue to grow.
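The score construction described in the abstract (select markers in a training sample, then form the weighted sum of associated alleles within each subject of a replication sample) can be sketched as follows. This is a minimal illustration, not the paper's own code: the function name `polygenic_score`, the toy genotype matrix, the effect sizes, the p-values, and the 0.05 threshold are all assumptions made for the example.

```python
import numpy as np

def polygenic_score(genotypes, effect_sizes, p_values, p_threshold=0.05):
    """Weighted sum of allele counts over markers passing a p-value threshold.

    genotypes    : (subjects x markers) allele counts (0, 1 or 2) in the replication sample
    effect_sizes : per-marker effect estimates from the training sample (the weights)
    p_values     : per-marker association p-values from the training sample
    p_threshold  : selection threshold for including a marker (hypothetical default)
    """
    genotypes = np.asarray(genotypes, dtype=float)
    effect_sizes = np.asarray(effect_sizes, dtype=float)
    p_values = np.asarray(p_values, dtype=float)
    selected = p_values <= p_threshold              # marker selection step
    return genotypes[:, selected] @ effect_sizes[selected]

# Toy data (invented for illustration): 3 subjects, 3 markers.
genotypes = np.array([[0, 1, 2],
                      [2, 0, 1],
                      [1, 1, 0]])
effect_sizes = np.array([0.5, -0.2, 0.1])
p_values = np.array([0.01, 0.20, 0.04])

scores = polygenic_score(genotypes, effect_sizes, p_values, p_threshold=0.05)
print(scores)
```

With the 0.05 threshold only the first and third markers enter the score; raising `p_threshold` to 1.0 includes every marker, which is one of the selection choices whose effect on power and accuracy the abstract discusses.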
Similar resources
Predictive accuracy of combined genetic and environmental risk scores
The substantial heritability of most complex diseases suggests that genetic data could provide useful risk prediction. To date the performance of genetic risk scores has fallen short of the potential implied by heritability, but this can be explained by insufficient sample sizes for estimating highly polygenic models. When risk predictors already exist based on environment or lifestyle, two key...
Accuracy of obesity indices alone or in combination for prediction of diabetes: A novel risk score by linear combination of general and abdominal measures of obesity
Background: The predictive power of obesity measures varies according to the presence of coexistent measures. The present study aimed to determine the predictive power of combinations of obesity measures for diabetes by calculation of a linear risk score. Methods: Data from a population-based cross-sectional study of 994 representative samples of Iranian adults in Babol, Iran were analyzed. Me...
Polygenic Selection, Polygenic Scores, Spatial Autocorrelation and Correlated Allele Frequencies. Can We Model Polygenic Selection on Intellectual Abilities?
The majority of polygenic selection signal of educational attainment GWAS hits is confined to a handful of SNPs within genomic regions replicated across GWAS publications. A polygenic score comprising 9 SNPs predicts population IQ (r=0.9), outperforming 99.9% of the polygenic scores obtained from sets of random SNPs. Its predictive power remains unaffected after controlling for spatial autocorr...
Evidence for Recent Polygenic Selection on Educational Attainment and Underlying Cognitive Abilities Inferred from GWAS Hits: A Monte Carlo Simulation Using Random SNPs
Background: The genetic variants identified by three large genome-wide association studies (GWAS) of educational attainment were used to test a polygenic selection model. Methods: Average frequencies of alleles with positive effect (polygenic scores or PS) were compared across populations (N=26) using data from 1000 Genomes. A null model was created using frequencies of random SNPs. Results: Po...
Can we detect polygenic selection on cognitive ability using GWAS hits? Employing random SNPs as a null model
Background: The genetic variants identified by three large genome-wide association studies (GWAS) of educational attainment were used to test a polygenic selection model. Methods: Average frequencies of alleles with positive effect (polygenic scores or PS) were compared across populations (N=26) using data from 1000 Genomes. A null model was created using frequencies of random SNPs. Results: Pol...
Journal title:
Volume 9, Issue
Pages -
Publication date: 2013